The US financial markets watchdog, the SEC, gave the go-ahead on Wednesday for the listing of a new bitcoin investment product, a decision seen as a major step towards the adoption of cryptocurrencies, which could shake up the sector. The aim of our work is to analyze the impact of the Security Exchange Comission's 10/10/2024 announcements (SEC decision) on bitcoin transactions.
The SEC, the US financial markets watchdog, has given the go-ahead for the listing of an ETF (exchange traded fund), an index fund that enables investors to profit from bitcoin's developments without investing their money directly in the digital currency. In concrete terms, investors buy shares in the fund, which they can sell at any time, rather than bitcoin. The fund's assets are invested in cryptocurrencies.
The market had already reached a milestone, in October 2021, with the listing of the first ETF invested not directly in bitcoin but in futures contracts linked to the cryptocurrency. Until now, access to digital currencies required opening an account on a cryptocurrency exchange platform and converting a traditional currency (issued by a central bank), such as the dollar.
By accepting the application of 11 investment companies, including BlackRock, the SEC has authorised a new bitcoin investment product. This is a major step forward for the adoption of cryptocurrencies, which could revolutionise the industry. This approval could "attract a significant amount of capital to the sector and is expected to have a potentially massive impact on valuations across the entire cryptocurrency sector", Swissquote analyst Ipek Ozkardeskaya.
Many investors had anticipated this authorisation. Over the last twelve months, the price of the leading cryptocurrency by capitalisation has soared by more than 150%. Bitcoin is now worth more than 45,000 dollars. In an uncertain geopolitical climate, some investors are also tending to turn to cryptoassets - which are reputed to be volatile - as a safe haven, BlackRock CEO Larry Fink recently said.
#!pip install nxviz
import os
import pandas as pd
import numpy as np
### Ondine path ###
# path = os.chdir('/Users/ondineji/Library/Mobile Documents/com~apple~CloudDocs/Documents/ACADEMIQUE/M2 FTD/S2/Network Analysis in Python /projet/data/')
### Hugo path ###
path = os.chdir('C:/Users/humic/OneDrive/Documents/Ecole/SorbonneFTD/Cours/Data_Mining_Network/project/data/')
pd.set_option('display.max_columns', None)
import warnings
warnings.filterwarnings("ignore")
# import data viz libs
import matplotlib.pyplot as plt
from matplotlib import cm
# Import graph libs
import networkx as nx
from nxviz import CircosPlot, ArcPlot
from cdlib import algorithms
# import custom functions
from utils import *
Note: to be able to use all crisp methods, you need to install some additional packages: {'infomap', 'wurlitzer', 'graph_tool', 'bayanpy', 'leidenalg'}
Note: to be able to use all crisp methods, you need to install some additional packages: {'ASLPAw', 'pyclustering'}
Note: to be able to use all crisp methods, you need to install some additional packages: {'infomap', 'wurlitzer', 'leidenalg'}
import yfinance as yf
def get_bitcoin_stock_prices(start_date, end_date):
"""
Retrieves daily Bitcoin stock prices for the specified date range.
Parameters:
start_date (str): Start date in the format 'YYYY-MM-DD'.
end_date (str): End date in the format 'YYYY-MM-DD'.
Returns:
DataFrame: DataFrame containing daily Bitcoin stock prices for the specified date range.
"""
# Define the ticker symbol for Bitcoin (BTC-USD)
ticker_symbol = 'BTC-USD'
# Create ticker object
ticker = yf.Ticker(ticker_symbol)
# Get historical data for the specified date range
historical_data = ticker.history(start=start_date, end=end_date)
return historical_data
# Example usage
start_date = '2024-01-01'
end_date = '2024-01-10'
bitcoin_prices_df = get_bitcoin_stock_prices(start_date, end_date)
bitcoin_prices_df
| Open | High | Low | Close | Volume | Dividends | Stock Splits | |
|---|---|---|---|---|---|---|---|
| Date | |||||||
| 2024-01-01 00:00:00+00:00 | 42280.234375 | 44175.437500 | 42214.976562 | 44167.332031 | 18426978443 | 0.0 | 0.0 |
| 2024-01-02 00:00:00+00:00 | 44187.140625 | 45899.707031 | 44176.949219 | 44957.968750 | 39335274536 | 0.0 | 0.0 |
| 2024-01-03 00:00:00+00:00 | 44961.601562 | 45503.242188 | 40813.535156 | 42848.175781 | 46342323118 | 0.0 | 0.0 |
| 2024-01-04 00:00:00+00:00 | 42855.816406 | 44770.023438 | 42675.175781 | 44179.921875 | 30448091210 | 0.0 | 0.0 |
| 2024-01-05 00:00:00+00:00 | 44192.980469 | 44353.285156 | 42784.718750 | 44162.691406 | 32336029347 | 0.0 | 0.0 |
| 2024-01-06 00:00:00+00:00 | 44178.953125 | 44227.632812 | 43475.156250 | 43989.195312 | 16092503468 | 0.0 | 0.0 |
| 2024-01-07 00:00:00+00:00 | 43998.464844 | 44495.570312 | 43662.230469 | 43943.097656 | 19330573863 | 0.0 | 0.0 |
| 2024-01-08 00:00:00+00:00 | 43948.707031 | 47218.000000 | 43244.082031 | 46970.503906 | 42746192015 | 0.0 | 0.0 |
| 2024-01-09 00:00:00+00:00 | 46987.640625 | 47893.699219 | 45244.714844 | 46139.730469 | 39821290992 | 0.0 | 0.0 |
| Column | Description |
|---|---|
| id | Unique identifier for each block record. |
| transaction_hash | Hash of the block's transaction. |
| time | Timestamp indicating when the block was mined, in Unix epoch format. |
| median_time | Median timestamp of the block. |
| size | Size of the block in bytes. |
| stripped_size | Size of the block after removing witness data (for Segregated Witness transactions). |
| weight | The weight of the block (considering transaction size and witness data), used in Segregated Witness. |
| version | The version of the block. |
| version_hex | Hexadecimal representation of the block version. |
| version_bits | Version bits for signaling soft forks. |
| merkle_root | Root of the Merkle tree of all transactions in the block. |
| nonce | A random value that miners increment to find a valid proof-of-work. |
| bits | Target threshold for the proof-of-work hash. |
| difficulty | Difficulty of finding a valid block hash. |
| chainwork | Cumulative work done to build the blockchain. |
| coinbase_data_hex | Hexadecimal representation of the coinbase transaction data. |
| transaction_count | Total number of transactions in the block. |
| witness_count | Number of witness transactions (for Segregated Witness). |
| input_count | Total number of inputs in all transactions. |
| output_count | Total number of outputs in all transactions. |
| input_total | Total value of all inputs in the block (in bitcoin). |
| input_total_usd | Total value of all inputs in the block (in USD). |
| output_total | Total value of all outputs in the block (in bitcoin). |
| output_total_usd | Total value of all outputs in the block (in USD). |
| fee_total | Total transaction fees collected by miners for all transactions in the block (in bitcoin). |
| fee_total_usd | Total transaction fees collected by miners for all transactions in the block (in USD). |
| fee_per_kb | Average transaction fee per kilobyte of the block. |
| fee_per_kb_usd | Average transaction fee per kilobyte of the block (in USD). |
| fee_per_kwu | Average transaction fee per kiloweight unit (Segregated Witness weight unit) of the block. |
| fee_per_kwu_usd | Average transaction fee per kiloweight unit of the block (in USD). |
| cdd_total | Cumulative Days Destroyed (a measure of how much existing coin supply is transacted). |
| generation | Reward generated by the coinbase transaction (in bitcoin). |
| generation_usd | Reward generated by the coinbase transaction (in USD). |
| reward | Total reward for miners (including transaction fees) for mining the block (in bitcoin). |
| reward_usd | Total reward for miners (including transaction fees) for mining the block (in USD). |
| guessed_miner | Estimation or guess of the mining pool or entity that mined the block based on analysis or heuristics. |
tsv_block = "dump_bitcoin_10_01_2024/blockchair_bitcoin_blocks_20240110.tsv"
# Read the TSV file into a pandas DataFrame
df_blocks = pd.read_csv(tsv_block, sep='\t')
df_blocks = df_blocks.rename(columns={'hash': 'transaction_hash'})
# Display the DataFrame
print("#### Block dimension ####")
print(df_blocks.shape)
print("\n#### Block attributes ####")
print(df_blocks.columns.tolist())
df_blocks
#### Block dimension #### (133, 36) #### Block attributes #### ['id', 'transaction_hash', 'time', 'median_time', 'size', 'stripped_size', 'weight', 'version', 'version_hex', 'version_bits', 'merkle_root', 'nonce', 'bits', 'difficulty', 'chainwork', 'coinbase_data_hex', 'transaction_count', 'witness_count', 'input_count', 'output_count', 'input_total', 'input_total_usd', 'output_total', 'output_total_usd', 'fee_total', 'fee_total_usd', 'fee_per_kb', 'fee_per_kb_usd', 'fee_per_kwu', 'fee_per_kwu_usd', 'cdd_total', 'generation', 'generation_usd', 'reward', 'reward_usd', 'guessed_miner']
| id | transaction_hash | time | median_time | size | stripped_size | weight | version | version_hex | version_bits | merkle_root | nonce | bits | difficulty | chainwork | coinbase_data_hex | transaction_count | witness_count | input_count | output_count | input_total | input_total_usd | output_total | output_total_usd | fee_total | fee_total_usd | fee_per_kb | fee_per_kb_usd | fee_per_kwu | fee_per_kwu_usd | cdd_total | generation | generation_usd | reward | reward_usd | guessed_miner | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 825071 | 000000000000000000031cb6b4df6a1238403d07055b05... | 2024-01-10 00:35:31 | 2024-01-09 23:27:46 | 1598270 | 798354 | 3993332 | 583933952 | 22ce2000 | 100010110011100010000000000000 | 3e1746a2512b8946846757735e834d5819f538eb8366e9... | 1209507708 | 386127977 | 73197634206448 | 000000000000000000000000000000000000000064d924... | 03ef960c1462696e616e63652f313030382e00a0020f93... | 2962 | 2751 | 7090 | 12208 | 4384137053122 | 2015519360 | 4384762053122 | 2.015807e+09 | 112190208 | 51577.2031 | 70211.560 | 32.2786 | 28104.379 | 12.9203 | 165530.082047 | 625000000 | 287331.25 | 737190208 | 338908.4688 | Binance |
| 1 | 825072 | 00000000000000000002cc13fbb98f2edf738fd007b138... | 2024-01-10 01:03:55 | 2024-01-09 23:27:57 | 1649184 | 781269 | 3992991 | 615956480 | 24b6c000 | 100100101101101100000000000000 | 21640b5e51e67c566033f802db0572b3fafa7f217afd96... | 2083341607 | 386127977 | 73197634206448 | 000000000000000000000000000000000000000064d966... | 03f0960c0497ec9d652f466f756e647279205553412050... | 2432 | 2260 | 8452 | 8875 | 5102700579174 | 2345864448 | 5103325579174 | 2.346152e+09 | 95000389 | 43674.5273 | 57614.855 | 26.4873 | 23798.223 | 10.9407 | 275368.332663 | 625000000 | 287331.25 | 720000389 | 331005.7812 | Foundry USA Pool |
| 2 | 825073 | 000000000000000000016a8554507d61e6dbc70a4216de... | 2024-01-10 01:17:27 | 2024-01-09 23:28:21 | 1589258 | 801401 | 3993461 | 660176896 | 27598000 | 100111010110011000000000000000 | 7ce942ef3713356330b5595b8197fe6e506e7cd4ebc454... | 70100840 | 386127977 | 73197634206448 | 000000000000000000000000000000000000000064d9a9... | 03f1960c172f5669614254432f4d696e65642062792061... | 3332 | 3089 | 7553 | 10199 | 1164968002118 | 535570752 | 1165593002118 | 5.358581e+08 | 75508770 | 34713.6484 | 47523.836 | 21.8482 | 18915.113 | 8.6958 | 248606.733430 | 625000000 | 287331.25 | 700508770 | 322044.9062 | ViaBTC |
| 3 | 825074 | 00000000000000000000104d23d4b3d2c220ea2555c4a0... | 2024-01-10 01:17:57 | 2024-01-09 23:39:32 | 1782086 | 737022 | 3993152 | 551550976 | 20e00000 | 100000111000000000000000000000 | 3298322ba382ce0b3c76dea2c4fbbaff90ff9af01f71ce... | 4050443161 | 386127977 | 73197634206448 | 000000000000000000000000000000000000000064d9eb... | 03f2960c082f5669614254432f2cfabe6d6d8ec77aa8ad... | 2147 | 1993 | 6189 | 8087 | 699464942935 | 321565024 | 700089942935 | 3.218524e+08 | 45535828 | 20934.1855 | 25557.459 | 11.7493 | 11407.536 | 5.2446 | 244141.365342 | 625000000 | 287331.25 | 670535828 | 308265.4375 | ViaBTC |
| 4 | 825075 | 0000000000000000000128f2f60d43b78b9459b423ae17... | 2024-01-10 01:26:31 | 2024-01-09 23:45:38 | 1932353 | 687086 | 3993611 | 604389376 | 24064000 | 100100000001100100000000000000 | 12289ed1d94938edd2bb59ac9bc0301c8495880c2b7c8b... | 2337013785 | 386127977 | 73197634206448 | 000000000000000000000000000000000000000064da2e... | 03f3960c194d696e656420627920416e74506f6f6c2097... | 2072 | 1971 | 8827 | 6012 | 577419223479 | 265456944 | 578044223479 | 2.657443e+08 | 65839947 | 30268.5996 | 34080.800 | 15.6681 | 16493.720 | 7.5828 | 246041.961771 | 625000000 | 287331.25 | 690839947 | 317599.8438 | AntPool |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 128 | 825199 | 0000000000000000000056407ba45b95b4697e0b78bc96... | 2024-01-10 22:58:25 | 2024-01-10 22:22:53 | 2274015 | 574598 | 3997809 | 647225344 | 2693e000 | 100110100100111110000000000000 | c30a02251c49123f820cfaccadbdfb26338ddd41dd03a5... | 3073202514 | 386127977 | 73197634206448 | 000000000000000000000000000000000000000064fa6d... | 036f970c2cfabe6d6d4965851c60167dca5ba4367b311e... | 2462 | 2362 | 4846 | 7027 | 320955941876 | 147553072 | 321580941876 | 1.478404e+08 | 50167919 | 23063.6973 | 22066.178 | 10.1444 | 12554.720 | 5.7719 | 11822.013023 | 625000000 | 287331.25 | 675167919 | 310394.9375 | F2Pool |
| 129 | 825200 | 0000000000000000000195e521b2ee04d52bdac56069de... | 2024-01-10 23:03:46 | 2024-01-10 22:31:11 | 1703348 | 764891 | 3998021 | 695418880 | 29734000 | 101001011100110100000000000000 | 40a7d67da1e57a8f35f8b0738b58957e7596d98cd9f16f... | 1728276037 | 386127977 | 73197634206448 | 000000000000000000000000000000000000000064fab0... | 0370970c2cfabe6d6d0aa5ea12b0c8c3511793a8d806db... | 3351 | 3208 | 6930 | 8188 | 460996648195 | 211933984 | 461621648195 | 2.122213e+08 | 59336634 | 27278.8301 | 34846.363 | 16.0198 | 14849.138 | 6.8265 | 21846.099858 | 625000000 | 287331.25 | 684336634 | 314610.0938 | F2Pool |
| 130 | 825201 | 00000000000000000003708536678656de241850f2c8d1... | 2024-01-10 23:26:20 | 2024-01-10 22:34:45 | 1619503 | 791389 | 3993670 | 805830656 | 30080000 | 110000000010000000000000000000 | 3331907f6e0c95df7a1878713fa7c5236bae800511b9bc... | 1406715446 | 386127977 | 73197634206448 | 000000000000000000000000000000000000000064faf2... | 0371970c194d696e656420627920416e74506f6f6c20b4... | 3595 | 3411 | 6507 | 12417 | 3922763169330 | 1803411968 | 3923388169330 | 1.803699e+09 | 97612928 | 44875.5898 | 60291.070 | 27.7176 | 24452.883 | 11.2418 | 218458.561485 | 625000000 | 287331.25 | 722612928 | 332206.8438 | AntPool |
| 131 | 825202 | 00000000000000000003ceb0e97b1b7de0c449546f7364... | 2024-01-10 23:34:19 | 2024-01-10 22:35:09 | 1671181 | 775555 | 3997846 | 536969216 | 20018000 | 100000000000011000000000000000 | ba5415ff7db9eafd6bbdf0adaf889a7f9a1e3794311965... | 3156739299 | 386127977 | 73197634206448 | 000000000000000000000000000000000000000064fb35... | 0372970c2cfabe6d6dcdf07697f67a6a92071ea4d08954... | 3820 | 3649 | 6601 | 11084 | 1233991147461 | 567302720 | 1234616147461 | 5.675901e+08 | 61210942 | 28140.5059 | 36638.188 | 16.8436 | 15318.138 | 7.0421 | 30406.603814 | 625000000 | 287331.25 | 686210942 | 315471.7500 | F2Pool |
| 132 | 825203 | 0000000000000000000357f2c0411a907b380bb3caae77... | 2024-01-10 23:51:21 | 2024-01-10 22:55:42 | 1695867 | 765649 | 3992814 | 562544640 | 2187c000 | 100001100001111100000000000000 | 9327da31731e160a0b624d6881cdcca1bc2bf22fb31896... | 3473839639 | 386127977 | 73197634206448 | 000000000000000000000000000000000000000064fb77... | 0373970c045a2d9f652f466f756e647279205553412050... | 3801 | 3642 | 6388 | 11549 | 3199955342134 | 1471115520 | 3200580342134 | 1.471403e+09 | 74880785 | 34424.9414 | 44162.600 | 20.3031 | 18758.960 | 8.6241 | 145833.700324 | 625000000 | 287331.25 | 699880785 | 321756.1875 | Foundry USA Pool |
133 rows × 36 columns
| Column | Description |
|---|---|
| block_id | Identifier of the block in which the transaction is included. |
| transaction_hash | Unique hash identifier of the transaction. |
| time | Timestamp of when the transaction was recorded in the blockchain. |
| size | Size of the transaction in bytes. |
| weight | Weight of the transaction, accounting for witness data in SegWit transactions. |
| version | Version number of the transaction format. |
| lock_time | The earliest time or block height when a transaction can be added to the blockchain. |
| is_coinbase | Binary indicator (True/False) for coinbase transactions. |
| has_witness | Binary indicator (True/False) if the transaction includes witness data (SegWit). |
| input_count | Number of inputs (UTXOs) in the transaction. |
| output_count | Number of outputs (destination addresses) in the transaction. |
| input_total | Total value of the inputs in satoshis. |
| input_total_usd | Total value of the inputs in USD. |
| output_total | Total value of the outputs in satoshis. |
| output_total_usd | Total value of the outputs in USD. |
| fee | Transaction fee paid by the sender in satoshis. |
| fee_usd | Transaction fee paid by the sender in USD. |
| fee_per_kb | Transaction fee rate per kilobyte of transaction size. |
| fee_per_kb_usd | Transaction fee rate per kilobyte in USD. |
| fee_per_kwu | Transaction fee rate per kiloweight unit (kwu). |
| fee_per_kwu_usd | Transaction fee rate per kiloweight unit in USD. |
| cdd_total | Cumulative Daily Difficulty (CDD) total. A measure related to historical network difficulty. |
tsv_transactions = "dump_bitcoin_10_01_2024/blockchair_bitcoin_transactions_20240110.tsv"
# Read the TSV file into a pandas DataFrame
df_transactions = pd.read_csv(tsv_transactions, sep='\t')
df_transactions = df_transactions.rename(columns={'hash': 'transaction_hash'})
# Display the DataFrame
print("#### Transaction dimension ####")
print(df_transactions.shape)
print("\n#### Transaction attributes ####")
print(df_transactions.columns.tolist())
df_transactions
#### Transaction dimension #### (383774, 22) #### Transaction attributes #### ['block_id', 'transaction_hash', 'time', 'size', 'weight', 'version', 'lock_time', 'is_coinbase', 'has_witness', 'input_count', 'output_count', 'input_total', 'input_total_usd', 'output_total', 'output_total_usd', 'fee', 'fee_usd', 'fee_per_kb', 'fee_per_kb_usd', 'fee_per_kwu', 'fee_per_kwu_usd', 'cdd_total']
| block_id | transaction_hash | time | size | weight | version | lock_time | is_coinbase | has_witness | input_count | output_count | input_total | input_total_usd | output_total | output_total_usd | fee | fee_usd | fee_per_kb | fee_per_kb_usd | fee_per_kwu | fee_per_kwu_usd | cdd_total | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 825071 | ef823e020330de717cc72f834cc3dc7f32baf4ea7e3e5b... | 2024-01-10 00:35:31 | 299 | 1088 | 1 | 0 | 1 | 1 | 1 | 3 | 0 | 0.000000e+00 | 737190208 | 3.389085e+05 | 0 | 0.0000 | 0.000 | 0.0000 | 0.00 | 0.0000 | 0.000000e+00 |
| 1 | 825071 | 3f325a9160b6d49b1a60bdeb56689838a7fd9c38f2af56... | 2024-01-10 00:35:31 | 225 | 573 | 1 | 0 | 0 | 1 | 1 | 2 | 465974173 | 2.142223e+05 | 465864157 | 2.141717e+05 | 110016 | 50.5777 | 488960.000 | 224.7896 | 192000.00 | 88.2682 | 9.743930e+00 |
| 2 | 825071 | 5c56ae6de10afbcc1786c3093e0fb92d4260d1cb9090ec... | 2024-01-10 00:35:31 | 225 | 900 | 1 | 0 | 0 | 0 | 1 | 2 | 8729714 | 4.013311e+03 | 8583260 | 3.945982e+03 | 146454 | 67.3293 | 650906.700 | 299.2415 | 162726.67 | 74.8105 | 1.429036e-01 |
| 3 | 825071 | ef3cde8f1ce0c98388b973f855aaee003e9265dfa09b38... | 2024-01-10 00:35:31 | 226 | 574 | 1 | 0 | 0 | 1 | 1 | 2 | 210000000 | 9.654330e+04 | 209928000 | 9.651020e+04 | 72000 | 33.1006 | 318584.060 | 146.4626 | 125435.54 | 57.6667 | 5.841597e-01 |
| 4 | 825071 | ef91085cbe36fb48b8391ada7af60e18b6e752eb4efc3a... | 2024-01-10 00:35:31 | 419 | 1025 | 1 | 0 | 0 | 1 | 2 | 2 | 17000493887 | 7.815637e+06 | 17000369903 | 7.815580e+06 | 123984 | 56.9992 | 295904.530 | 136.0364 | 120960.00 | 55.6089 | 1.437647e+01 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 383769 | 825203 | 6fdf800be6274f476ad89322818a5ba54ba9f9cd530792... | 2024-01-10 23:51:21 | 379 | 625 | 1 | 0 | 0 | 1 | 1 | 1 | 6969 | 3.203900e+00 | 294 | 1.352000e-01 | 6675 | 3.0687 | 17612.137 | 8.0968 | 10680.00 | 4.9099 | 8.243424e-07 |
| 383770 | 825203 | cf81f0ca6c1ec5639ff010dfe44a4fbe41708e0eb6ab80... | 2024-01-10 23:51:21 | 379 | 625 | 1 | 0 | 0 | 1 | 1 | 1 | 6969 | 3.203900e+00 | 294 | 1.352000e-01 | 6675 | 3.0687 | 17612.137 | 8.0968 | 10680.00 | 4.9099 | 8.243424e-07 |
| 383771 | 825203 | fb8907c6983b9569c2eb8532e58b2db83752799390e69e... | 2024-01-10 23:51:21 | 379 | 625 | 1 | 0 | 0 | 1 | 1 | 1 | 6969 | 3.203900e+00 | 294 | 1.352000e-01 | 6675 | 3.0687 | 17612.137 | 8.0968 | 10680.00 | 4.9099 | 8.243424e-07 |
| 383772 | 825203 | 37e8e17e238972b852f143c6c3473128d821da14159f3b... | 2024-01-10 23:51:21 | 379 | 625 | 1 | 0 | 0 | 1 | 1 | 1 | 6969 | 3.203900e+00 | 294 | 1.352000e-01 | 6675 | 3.0687 | 17612.137 | 8.0968 | 10680.00 | 4.9099 | 8.243424e-07 |
| 383773 | 825203 | b7311d24739cdf0a680f5c70ee09ea15e2134e3fc1a7ff... | 2024-01-10 23:51:21 | 379 | 625 | 1 | 0 | 0 | 1 | 1 | 1 | 6969 | 3.203900e+00 | 294 | 1.352000e-01 | 6675 | 3.0687 | 17612.137 | 8.0968 | 10680.00 | 4.9099 | 8.243424e-07 |
383774 rows × 22 columns
| Column | Description |
|---|---|
| address | Bitcoin address, a unique identifier on the blockchain. |
| balance | The balance of Bitcoin associated with the address, measured in satoshis. |
tsv_adresses = "dump_bitcoin_10_01_2024/blockchair_bitcoin_addresses_latest.tsv"
# Read the TSV file into a pandas DataFrame
df_adresses = pd.read_csv(tsv_adresses, sep='\t')
df_adresses = df_adresses.rename(columns={'address': 'hash_address'})
# Display the DataFrame
print("#### Adresses dimension ####")
print(df_adresses.shape)
print("\n#### Adresses attributes ####")
print(df_adresses.columns.tolist())
df_adresses
#### Adresses dimension #### (52598833, 2) #### Adresses attributes #### ['hash_address', 'balance']
| hash_address | balance | |
|---|---|---|
| 0 | 34xp4vRoCGJym3xR7yCVPFHoCNxv4Twseo | 24859746692320 |
| 1 | bc1qgdjqv0av3q56jvd82tkdjpy7gdp9ut8tlqmgrpmv24... | 20401007547456 |
| 2 | bc1ql49ydapnjafl5t2cp9zqpjwe6pdgmxy98859v2 | 12713620120811 |
| 3 | 39884E3j6KZj82FK4vcCrkUvWYL5MQaS3v | 11517738865340 |
| 4 | bc1qazcm763858nkj2dj986etajv6wquslv8uxwczt | 9464330762512 |
| ... | ... | ... |
| 52598828 | 1AyjGkoaiNir5PRMTvyTHSyCLguyEgYpLv | 1 |
| 52598829 | 16WZ9Fi8EDqyPCeLmEekWuPqxhk7t5K4fb | 1 |
| 52598830 | 1FaUaprkrAZnWZPCj5JZaK9R18W9XBzD81 | 1 |
| 52598831 | 1JNuAjoN9cj8zyHfJ45nfhMMEgkoVwLvkT | 1 |
| 52598832 | 13xUr3oovz6rd2w9NrpHEyZfr1BfNs3pum | 1 |
52598833 rows × 2 columns
csv_entity = "dump_bitcoin_10_01_2024/addresses_2016.csv"
# Read the TSV file into a pandas DataFrame
df_entity = pd.read_csv(csv_entity)
df_entity = df_entity.rename(columns={'address': 'hash_address'})
df_entity
| address_num | address_type | hash_address | cluster | data | addresses | clusters | entity_name | entity_type | |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 20803569 | pubkeyhash | 1DrK44np3gMKuvcGeFVv9Jk67zodP52eMu | 107718037.0 | NaN | NaN | NaN | BitFury | mining |
| 1 | 47665646 | pubkeyhash | 1Ay1fs7ghRvURVcYMxEk1in9PMtrrMKoSR | 117741795.0 | NaN | NaN | NaN | NaN | NaN |
| 2 | 168251500 | pubkeyhash | 1CuCRqGMc1Vbax4nk5JTALepVAGR6eh2AW | 23159943.0 | NaN | NaN | NaN | NaN | NaN |
| 3 | 173405976 | pubkeyhash | 1HiWFALZS9WR6ZhptzDidpGEvDUdbqzGAi | 182789480.0 | NaN | NaN | NaN | NaN | NaN |
| 4 | 173432615 | pubkeyhash | 1LHoMBBnYboJgsfyn1Jk9PfjH274L96qgL | 182811195.0 | NaN | NaN | NaN | NaN | NaN |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 720647 | 173936534 | pubkeyhash | 1G7oWEsDbEwdr5WE1k9w1wkBuvKpu9quZ4 | 183177928.0 | NaN | NaN | NaN | NaN | NaN |
| 720648 | 173936535 | pubkeyhash | 1MoNs5uH4CF43XHH17NUCYYVw1Nv68BvFX | 155137012.0 | NaN | NaN | NaN | NaN | NaN |
| 720649 | 85082659 | pubkeyhash | 1EUgmNoALra2YSCGcqdpLCNnGYAFQeLEp9 | 132278807.0 | NaN | NaN | NaN | NaN | NaN |
| 720650 | 8373447 | multisig1of3 | NaN | NaN | NaN | ['1PexU18WvWPTcEFWpFkzRdm1qXWShoeBan', '12xeyA... | [183177929, 183177930, 132278807] | NaN | NaN |
| 720651 | 8373448 | multisig1of3 | NaN | NaN | NaN | ['1HgBEFp4SyFcBWtJChFrozPDLBhdhh7pS6', '1FL3PH... | [183177931, 183177932, 132278807] | NaN | NaN |
720652 rows × 9 columns
| Column | Description |
|---|---|
| block_id | Identifier of the block in which the transaction is included. |
| transaction_hash | Unique hash identifier of the transaction. |
| index | Index of the input in the transaction. |
| time | Timestamp of when the input transaction was recorded in the blockchain. |
| value | Monetary value of the input in satoshis. |
| value_usd | Monetary value of the input in USD. |
| recipient | Address number of the input recipient. |
| type | Type of the address (e.g., pubkey, pubkeyhash, scripthash, multisig). |
| script_hex | Hexadecimal representation of the input script. |
| is_from_coinbase | Binary indicator (True/False) if the input is from a coinbase transaction. |
| is_spendable | Binary indicator (True/False) if the input is spendable. |
| spending_block_id | Identifier of the block where the input is spent. |
| spending_transaction_hash | Unique hash identifier of the spending transaction. |
| spending_index | Index of the spent output in the spending transaction. |
| spending_time | Timestamp of when the spending transaction was recorded. |
| spending_value_usd | Monetary value of the spending input in USD. |
| spending_sequence | Sequence number of the spending transaction. |
| spending_signature_hex | Hexadecimal representation of the spending input signature. |
| spending_witness | Binary indicator (True/False) if the spending transaction uses a witness. |
| lifespan | Time difference (in seconds) between the spending and input transactions. |
| cdd | Cumulative Daily Difficulty (CDD) of the input transaction. |
tsv_inputs = "dump_bitcoin_10_01_2024/blockchair_bitcoin_inputs_20240110.tsv"
# Read the TSV file into a pandas DataFrame
df_inputs = pd.read_csv(tsv_inputs, sep='\t')
df_inputs = df_inputs.rename(columns={'recipient': 'hash_address'})
# Display the DataFrame
print("#### Intputs dimension ####")
print(df_inputs.shape)
print("\n#### Intputs attributes ####")
print(df_inputs.columns.tolist())
df_inputs = df_inputs.drop_duplicates(subset='transaction_hash', keep='first')
df_inputs
#### Intputs dimension #### (899940, 21) #### Intputs attributes #### ['block_id', 'transaction_hash', 'index', 'time', 'value', 'value_usd', 'hash_address', 'type', 'script_hex', 'is_from_coinbase', 'is_spendable', 'spending_block_id', 'spending_transaction_hash', 'spending_index', 'spending_time', 'spending_value_usd', 'spending_sequence', 'spending_signature_hex', 'spending_witness', 'lifespan', 'cdd']
| block_id | transaction_hash | index | time | value | value_usd | hash_address | type | script_hex | is_from_coinbase | is_spendable | spending_block_id | spending_transaction_hash | spending_index | spending_time | spending_value_usd | spending_sequence | spending_signature_hex | spending_witness | lifespan | cdd | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 824788 | 8d27bc23cd585a7181e967a626e34a2292a09b9cd9c387... | 0 | 2024-01-07 22:24:21 | 465974173 | 2.050612e+05 | bc1qykpk3q0m9ku2d9fc6q8hxn3ksk4z8a7g8da089 | witness_v0_scripthash | 001425836881fb2db8a69538d00f734e3685aa23f7c8 | 0 | -1 | 825071 | 3f325a9160b6d49b1a60bdeb56689838a7fd9c38f2af56... | 0 | 2024-01-10 00:35:31 | 2.142223e+05 | 4294967293 | NaN | 3044022015fcf7ec7036938864e49e5851b7335870f0ef... | 180670 | 9.743930 |
| 1 | 824840 | 680643bf66468a9766f8ab0fabeb460961131387f6cfd6... | 1 | 2024-01-08 09:18:16 | 8729714 | 3.832781e+03 | 1JwPSPmZqiAKKenSq3zuVexvrzEjwWzHg8 | pubkeyhash | 76a914c4c35494c994805b405ce3df19d7d2b1565c3f15... | 0 | -1 | 825071 | 5c56ae6de10afbcc1786c3093e0fb92d4260d1cb9090ec... | 0 | 2024-01-10 00:35:31 | 4.013311e+03 | 268435456 | 47304402201e7a2ae8e888bbefeb711979a7f6b51e9bf0... | NaN | 141435 | 0.142904 |
| 2 | 825030 | 4514db523968fdcec96d62f24e5a079ff9c4df70d56c32... | 0 | 2024-01-09 17:54:57 | 210000000 | 9.892890e+04 | bc1qgvrek7mjkddrcq3w2t3xc06ekqdaqs23fw4gav | witness_v0_scripthash | 001443079b7b72b35a3c022e52e26c3f59b01bd04151 | 0 | -1 | 825071 | ef3cde8f1ce0c98388b973f855aaee003e9265dfa09b38... | 0 | 2024-01-10 00:35:31 | 9.654330e+04 | 4294967293 | NaN | 3045022100860693bdce15907426b7e309ae08f765be8a... | 24034 | 0.584160 |
| 3 | 825061 | d037da05e086ee5375e273cbb4e5cb26db68b289934353... | 0 | 2024-01-09 22:34:03 | 17000000000 | 8.008530e+06 | 3LhVNi7CGDKLmEfnNGaJG2KnF1n3Pvf5r4 | scripthash | a914d0814e10ddbe18fd27941598851b62ec073b4edd87 | 0 | -1 | 825071 | ef91085cbe36fb48b8391ada7af60e18b6e752eb4efc3a... | 0 | 2024-01-10 00:35:31 | 7.815410e+06 | 4294967295 | 160014ad74d075c9e0e2daef1bc1ac0a1620618917d6aa | 30440220072ec26fbc0c3e01f4d862b16dfbc442976e2d... | 7288 | 14.339815 |
| 4 | 824019 | b7d6de5f867d4a00d09a9cf82902d4b0f0da2d750752c1... | 1 | 2024-01-02 14:28:12 | 493887 | 2.205008e+02 | 34ZbJuSvQaCB8Ms4AFMEgvegg58yEpXdnW | scripthash | a9141f80eb22f83b9515637c37ca4c0e91a9a6e17b1287 | 0 | -1 | 825071 | ef91085cbe36fb48b8391ada7af60e18b6e752eb4efc3a... | 1 | 2024-01-10 00:35:31 | 2.270547e+02 | 4294967295 | 160014308cc437d0807917e6083ce12ed5e0f8f7d7661f | 3045022100faf6c021a03ec28271e1f662cafcf7935667... | 641239 | 0.036655 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 899771 | 825202 | 9610e6cedece37f246afcbfb9d9d4ce51717a0d3f5b093... | 2 | 2024-01-10 23:34:19 | 16972653 | 7.802838e+03 | bc1pfp764ez6ssxcnyfg62qy50eu32u9h6d920ls457aj0... | witness_v1_taproot | 5120487daae45a840d899128d2804a3f3c8ab85be9a553... | 0 | -1 | 825203 | d1236a945709814363b79497fbe0ce6eb3dd26a12f81ae... | 0 | 2024-01-10 23:51:21 | 7.802838e+03 | 4294967295 | NaN | dc1721a4ce4e7dd5ed0384f1659c6d68f912a7b193188c... | 1022 | 0.002008 |
| 899772 | 825203 | d1236a945709814363b79497fbe0ce6eb3dd26a12f81ae... | 1 | 2024-01-10 23:51:21 | 16907478 | 7.772875e+03 | bc1pfp764ez6ssxcnyfg62qy50eu32u9h6d920ls457aj0... | witness_v1_taproot | 5120487daae45a840d899128d2804a3f3c8ab85be9a553... | 0 | -1 | 825203 | 0c1ee15673e7cbb83ef9046fe03bbe6fbcae4b6b338d41... | 0 | 2024-01-10 23:51:21 | 7.772875e+03 | 4294967295 | NaN | 31caa0c6583b0b8cdb7c3be1a1410304e8a927a9e64e4d... | 0 | 0.000000 |
| 899774 | 825203 | c24abf0fca46bd5a6ab65868de80bf3fc833af6bd57838... | 1 | 2024-01-10 23:51:21 | 44508 | 2.046170e+01 | bc1qg6ay59mm4lh6g27qfcpqshtqrpu8mffq5fp6tc | witness_v0_scripthash | 001446ba4a177bafefa42bc04e02085d6018787da520 | 0 | -1 | 825203 | 077fa44271cc81698acef7321c3011cae34c030f31045e... | 0 | 2024-01-10 23:51:21 | 2.046170e+01 | 4294967295 | NaN | 3045022100e0b17569f046512218b7354b996193af67a1... | 0 | 0.000000 |
| 899775 | 825200 | 68da72d8daaef9541adfd4a212cd72ecaee6a541c90f39... | 0 | 2024-01-10 23:03:46 | 316553914 | 1.455293e+05 | bc1q7084tx0ztvxq3mfnx8nwpa4hpwuxf8zxrj3tcvyn6z... | witness_v0_keyhash | 0020f3cf5599e25b0c08ed3331e6e0f6b70bb8649c461c... | 0 | -1 | 825203 | 8ea76e6b6ac29c91c9947f737a92d84c34e6e80dd3f091... | 0 | 2024-01-10 23:51:21 | 1.455293e+05 | 4294967295 | NaN | 3044022079009aec6f11fdedeb1748577c0690ea6977d6... | 2855 | 0.104602 |
| 899778 | 825203 | b9f0148dd7f89ce61d3012b1e3debbd743daed52af19c5... | 0 | 2024-01-10 23:51:21 | 546 | 2.510000e-01 | bc1q6paugjafz08l60guyu8r6ut743va7aju0qj6nh | witness_v0_scripthash | 0014d07bc44ba913cffd3d1c270e3d717eac59df765c | 0 | -1 | 825203 | f8fea9b38b16f76e6ac5a5ecb50a6cd4911689d4a4b0b4... | 0 | 2024-01-10 23:51:21 | 2.510000e-01 | 4294967295 | NaN | 3045022100da09241a0846475b3fea18c28881609f66ec... | 0 | 0.000000 |
537886 rows × 21 columns
| Column | Description |
|---|---|
| block_id | Identifier of the block in which the transaction is included. |
| transaction_hash | Unique hash identifier of the transaction. |
| index | Index of the output in the transaction. |
| time | Timestamp of when the output transaction was recorded in the blockchain. |
| value | Monetary value of the output in satoshis. |
| value_usd | Monetary value of the output in USD. |
| recipient | Address number of the output recipient. |
| type | Type of the address (e.g., pubkey, pubkeyhash, scripthash, multisig). |
| script_hex | Hexadecimal representation of the output script. |
| is_from_coinbase | Binary indicator (True/False) if the output is from a coinbase transaction. |
| is_spendable | Binary indicator (True/False) if the output is spendable. |
tsv_outputs = "dump_bitcoin_10_01_2024/blockchair_bitcoin_outputs_20240110.tsv"
# Read the TSV file into a pandas DataFrame
df_outputs = pd.read_csv(tsv_outputs, sep='\t')
df_outputs = df_outputs.rename(columns={'hash': 'transaction_hash', 'recipient': 'hash_address'})
# Display the DataFrame
print("#### Outputs dimension ####")
print(df_outputs.shape)
print("\n#### Outputs attributes ####")
print(df_outputs.columns.tolist())
df_outputs = df_outputs.drop_duplicates(subset='transaction_hash', keep='first')
df_outputs
#### Outputs dimension #### (1295729, 11) #### Outputs attributes #### ['block_id', 'transaction_hash', 'index', 'time', 'value', 'value_usd', 'hash_address', 'type', 'script_hex', 'is_from_coinbase', 'is_spendable']
| block_id | transaction_hash | index | time | value | value_usd | hash_address | type | script_hex | is_from_coinbase | is_spendable | |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 825071 | ef823e020330de717cc72f834cc3dc7f32baf4ea7e3e5b... | 0 | 2024-01-10 00:35:31 | 737190208 | 3.389085e+05 | 3L8Ck6bm3sve1vJGKo6Ht2k167YKSKi8TZ | scripthash | a914ca35b1f4d02907314852f09935b9604507f8d70087 | 1 | -1 |
| 5 | 825071 | 3f325a9160b6d49b1a60bdeb56689838a7fd9c38f2af56... | 0 | 2024-01-10 00:35:31 | 200000000 | 9.194600e+04 | 1PbpDWfJLyLC6NAmNHgCaMGV3EFfCm6imi | pubkeyhash | 76a914f7e852d9db839c950866cc1aba405eaeef61e939... | 0 | -1 |
| 9 | 825071 | 5c56ae6de10afbcc1786c3093e0fb92d4260d1cb9090ec... | 0 | 2024-01-10 00:35:31 | 5927900 | 2.725233e+03 | 1KsYPRDveiGwHPn8mCrPKyaeRNVmCJ3Bdk | pubkeyhash | 76a914cf0143288aca690d76de6107fd3d9966c282af0e... | 0 | -1 |
| 11 | 825071 | ef3cde8f1ce0c98388b973f855aaee003e9265dfa09b38... | 0 | 2024-01-10 00:35:31 | 40000000 | 1.838920e+04 | 15zrhjZ5N7ThsohJjM3Jp5xtaM1QYNUT6M | pubkeyhash | 76a91436d1b3cd51bd1a4cd4f04479698bde41b13952f1... | 0 | -1 |
| 15 | 825071 | ef91085cbe36fb48b8391ada7af60e18b6e752eb4efc3a... | 0 | 2024-01-10 00:35:31 | 17000000000 | 7.815410e+06 | 3K5oE7ew2BPG5tsAB6wt6fFfzQRoGg1yHy | scripthash | a914bec91b4309e151ef594a9bfd490c3bcd736a6e0487 | 0 | -1 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 1295724 | 825203 | 6fdf800be6274f476ad89322818a5ba54ba9f9cd530792... | 0 | 2024-01-10 23:51:21 | 294 | 1.352000e-01 | bc1qntwyll7kkrsn4xtqvyvnwpx5hdkxzfla5muhh7 | witness_v0_scripthash | 00149adc4fffd6b0e13a996061193704d4bb6c6127fd | 0 | -1 |
| 1295725 | 825203 | cf81f0ca6c1ec5639ff010dfe44a4fbe41708e0eb6ab80... | 0 | 2024-01-10 23:51:21 | 294 | 1.352000e-01 | bc1qntwyll7kkrsn4xtqvyvnwpx5hdkxzfla5muhh7 | witness_v0_scripthash | 00149adc4fffd6b0e13a996061193704d4bb6c6127fd | 0 | -1 |
| 1295726 | 825203 | fb8907c6983b9569c2eb8532e58b2db83752799390e69e... | 0 | 2024-01-10 23:51:21 | 294 | 1.352000e-01 | bc1qntwyll7kkrsn4xtqvyvnwpx5hdkxzfla5muhh7 | witness_v0_scripthash | 00149adc4fffd6b0e13a996061193704d4bb6c6127fd | 0 | -1 |
| 1295727 | 825203 | 37e8e17e238972b852f143c6c3473128d821da14159f3b... | 0 | 2024-01-10 23:51:21 | 294 | 1.352000e-01 | bc1qntwyll7kkrsn4xtqvyvnwpx5hdkxzfla5muhh7 | witness_v0_scripthash | 00149adc4fffd6b0e13a996061193704d4bb6c6127fd | 0 | -1 |
| 1295728 | 825203 | b7311d24739cdf0a680f5c70ee09ea15e2134e3fc1a7ff... | 0 | 2024-01-10 23:51:21 | 294 | 1.352000e-01 | bc1qntwyll7kkrsn4xtqvyvnwpx5hdkxzfla5muhh7 | witness_v0_scripthash | 00149adc4fffd6b0e13a996061193704d4bb6c6127fd | 0 | -1 |
383774 rows × 11 columns
df_transactions = df_transactions[["transaction_hash", "time", "size", "weight", "input_count", "output_count",
"input_total", "input_total_usd", "output_total", "output_total_usd", "fee",
"fee_usd", "is_coinbase"]]
df_entity = df_entity[["address_num", "address_type", "hash_address", "entity_name", "entity_type"]]
df_inputs = df_inputs[["transaction_hash", "index", "hash_address", "value", "value_usd", "time", "spending_transaction_hash", "spending_time", "spending_value_usd"]]
df_inputs = df_inputs.rename(columns={
'index': 'index_input',
'hash_address': 'hash_address_input',
'value': 'value_input',
'value_usd': 'value_usd_input',
'time': 'time_input'
})
df_outputs = df_outputs[["transaction_hash", "index", "hash_address", "value", "value_usd", "time"]]
df_outputs = df_outputs.rename(columns={
'index': 'index_output',
'hash_address': 'hash_address_output',
'value': 'value_output',
'value_usd': 'value_usd_output',
'time': 'time_output'
})
edges = pd.merge(df_inputs, df_outputs, on='transaction_hash', how='inner')
print("=== Columns ===")
print(edges.columns)
print("\n\n=== Dimensions ===")
print(edges.shape)
print("\n\n=== Sample ===")
edges.head()
=== Columns ===
Index(['transaction_hash', 'index_input', 'hash_address_input', 'value_input',
'value_usd_input', 'time_input', 'spending_transaction_hash',
'spending_time', 'spending_value_usd', 'index_output',
'hash_address_output', 'value_output', 'value_usd_output',
'time_output'],
dtype='object')
=== Dimensions ===
(181958, 14)
=== Sample ===
| transaction_hash | index_input | hash_address_input | value_input | value_usd_input | time_input | spending_transaction_hash | spending_time | spending_value_usd | index_output | hash_address_output | value_output | value_usd_output | time_output | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 19164e615c48161141ff5c750c08c6a21202446b16720a... | 0 | bc1pu9uysdxzq99fzat8myautah7qnahec64dm3smpc95l... | 47076 | 21.6423 | 2024-01-10 00:35:31 | b3b8d806902bbbb45371a16344c004c77a3d09f280f92d... | 2024-01-10 00:35:31 | 21.6423 | 0 | bc1pu9uysdxzq99fzat8myautah7qnahec64dm3smpc95l... | 47076 | 21.6423 | 2024-01-10 00:35:31 |
| 1 | 93f96b0e895bce583c0762d714e5d6cfb0eb2df5f4c8e7... | 0 | 14dJRoKyj2i83uRbTUeKqhFMwvFZcpiXyn | 768666170 | 353378.9062 | 2024-01-10 00:35:31 | 7525607e5547dafaf647d8a0f4073ba873c0b3e614cf77... | 2024-01-10 00:35:31 | 353378.9062 | 0 | 14dJRoKyj2i83uRbTUeKqhFMwvFZcpiXyn | 768666170 | 353378.9062 | 2024-01-10 00:35:31 |
| 2 | b3b8d806902bbbb45371a16344c004c77a3d09f280f92d... | 0 | bc1qwfn4fq5l86snpddxa7wy9jh6dc86vd6jdquw2g | 546 | 0.2510 | 2024-01-10 00:35:31 | 2a3b7681cc8e2828f6d167f94784c16a01d063e32aae64... | 2024-01-10 00:35:31 | 0.2510 | 0 | bc1qwfn4fq5l86snpddxa7wy9jh6dc86vd6jdquw2g | 546 | 0.2510 | 2024-01-10 00:35:31 |
| 3 | 6e2d986e8a97c21b51e5f5c8d03a6e35c5d575aff5e8cc... | 0 | bc1pk8fr0lkcfp7csz04hqn4alqy332h0nksmqmjjg9hye... | 42786 | 19.6700 | 2024-01-10 00:35:31 | f35ad09853a3707e9146fc30eb049142f8b66fd677bf97... | 2024-01-10 00:35:31 | 19.6700 | 0 | bc1pk8fr0lkcfp7csz04hqn4alqy332h0nksmqmjjg9hye... | 42786 | 19.6700 | 2024-01-10 00:35:31 |
| 4 | f35ad09853a3707e9146fc30eb049142f8b66fd677bf97... | 0 | 1GrwDkr33gT6LuumniYjKEGjTLhsL5kmqC | 546 | 0.2510 | 2024-01-10 00:35:31 | 1a7622dab7b996900b4ba60dd598aa29d75dcfd244924b... | 2024-01-10 00:35:31 | 0.2510 | 0 | 1GrwDkr33gT6LuumniYjKEGjTLhsL5kmqC | 546 | 0.2510 | 2024-01-10 00:35:31 |
edges = edges[edges['hash_address_input'] == edges['hash_address_output']]
edges = edges[['transaction_hash', 'spending_transaction_hash', 'time_input', 'time_output',
'hash_address_input', 'hash_address_output',
'value_input', 'value_output', 'value_usd_input', 'value_usd_output']]
edges
| transaction_hash | spending_transaction_hash | time_input | time_output | hash_address_input | hash_address_output | value_input | value_output | value_usd_input | value_usd_output | |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 19164e615c48161141ff5c750c08c6a21202446b16720a... | b3b8d806902bbbb45371a16344c004c77a3d09f280f92d... | 2024-01-10 00:35:31 | 2024-01-10 00:35:31 | bc1pu9uysdxzq99fzat8myautah7qnahec64dm3smpc95l... | bc1pu9uysdxzq99fzat8myautah7qnahec64dm3smpc95l... | 47076 | 47076 | 21.6423 | 21.6423 |
| 1 | 93f96b0e895bce583c0762d714e5d6cfb0eb2df5f4c8e7... | 7525607e5547dafaf647d8a0f4073ba873c0b3e614cf77... | 2024-01-10 00:35:31 | 2024-01-10 00:35:31 | 14dJRoKyj2i83uRbTUeKqhFMwvFZcpiXyn | 14dJRoKyj2i83uRbTUeKqhFMwvFZcpiXyn | 768666170 | 768666170 | 353378.9062 | 353378.9062 |
| 2 | b3b8d806902bbbb45371a16344c004c77a3d09f280f92d... | 2a3b7681cc8e2828f6d167f94784c16a01d063e32aae64... | 2024-01-10 00:35:31 | 2024-01-10 00:35:31 | bc1qwfn4fq5l86snpddxa7wy9jh6dc86vd6jdquw2g | bc1qwfn4fq5l86snpddxa7wy9jh6dc86vd6jdquw2g | 546 | 546 | 0.2510 | 0.2510 |
| 3 | 6e2d986e8a97c21b51e5f5c8d03a6e35c5d575aff5e8cc... | f35ad09853a3707e9146fc30eb049142f8b66fd677bf97... | 2024-01-10 00:35:31 | 2024-01-10 00:35:31 | bc1pk8fr0lkcfp7csz04hqn4alqy332h0nksmqmjjg9hye... | bc1pk8fr0lkcfp7csz04hqn4alqy332h0nksmqmjjg9hye... | 42786 | 42786 | 19.6700 | 19.6700 |
| 4 | f35ad09853a3707e9146fc30eb049142f8b66fd677bf97... | 1a7622dab7b996900b4ba60dd598aa29d75dcfd244924b... | 2024-01-10 00:35:31 | 2024-01-10 00:35:31 | 1GrwDkr33gT6LuumniYjKEGjTLhsL5kmqC | 1GrwDkr33gT6LuumniYjKEGjTLhsL5kmqC | 546 | 546 | 0.2510 | 0.2510 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 181947 | 342750fe8b5fd681d07103f377528b1528415c467b6a3a... | 26d078c6cc1cb51e207663c4f6fa0ec45c3f3a1a7228a8... | 2024-01-10 23:51:21 | 2024-01-10 23:51:21 | bc1pxx9qcafmhg0er6w92ffcstfwksymjy7tmvwwzqh54m... | bc1pxx9qcafmhg0er6w92ffcstfwksymjy7tmvwwzqh54m... | 35277 | 35277 | 16.2179 | 16.2179 |
| 181948 | 806a43c458e6c6fb407452fcce180091ad1daa80237760... | 6f6fb148dad5038fdd8a03dd35b45fd13aa1ca5f93a378... | 2024-01-10 23:51:21 | 2024-01-10 23:51:21 | bc1plz9ytx2a24va8zu59shg8unv6n86anvqd9kvcp6vcf... | bc1plz9ytx2a24va8zu59shg8unv6n86anvqd9kvcp6vcf... | 11665 | 11665 | 5.3628 | 5.3628 |
| 181952 | 35b2fd56400a31ef30263e39083c5dc387addb20ff42c7... | 83d2ff592f8f7064185cdd7168c63e6fb89bb3d9ab97bd... | 2024-01-10 23:34:19 | 2024-01-10 23:34:19 | bc1qzcgex3h3sqmzj2qjztk5k9lhd8m3ts4r27lu02 | bc1qzcgex3h3sqmzj2qjztk5k9lhd8m3ts4r27lu02 | 286065 | 286065 | 131.5127 | 131.5127 |
| 181956 | 68da72d8daaef9541adfd4a212cd72ecaee6a541c90f39... | 8ea76e6b6ac29c91c9947f737a92d84c34e6e80dd3f091... | 2024-01-10 23:03:46 | 2024-01-10 23:03:46 | bc1q7084tx0ztvxq3mfnx8nwpa4hpwuxf8zxrj3tcvyn6z... | bc1q7084tx0ztvxq3mfnx8nwpa4hpwuxf8zxrj3tcvyn6z... | 316553914 | 316553914 | 145529.3281 | 145529.3281 |
| 181957 | b9f0148dd7f89ce61d3012b1e3debbd743daed52af19c5... | f8fea9b38b16f76e6ac5a5ecb50a6cd4911689d4a4b0b4... | 2024-01-10 23:51:21 | 2024-01-10 23:51:21 | bc1q6paugjafz08l60guyu8r6ut743va7aju0qj6nh | bc1q6paugjafz08l60guyu8r6ut743va7aju0qj6nh | 546 | 546 | 0.2510 | 0.2510 |
105532 rows × 10 columns
import os
import pandas as pd
import numpy as np
### Ondine path ###
# path = os.chdir('/Users/ondineji/Library/Mobile Documents/com~apple~CloudDocs/Documents/ACADEMIQUE/M2 FTD/S2/Network Analysis in Python /projet/data/')
### Hugo path ###
path = os.chdir('C:/Users/humic/OneDrive/Documents/Ecole/SorbonneFTD/Cours/Data_Mining_Network/project/data/')
pd.set_option('display.max_columns', None)
import warnings
warnings.filterwarnings("ignore")
# import data viz libs
import matplotlib.pyplot as plt
from matplotlib import cm
# Import graph libs
import networkx as nx
from nxviz import CircosPlot, ArcPlot
from cdlib import algorithms
# import custom functions
from utils import *
Edge = an input / an output (attributes: address/entity, amount)
edges_df = pd.read_csv("edges.csv")
edges_df = edges_df[['transaction_hash', 'spending_transaction_hash', 'time_input', 'hash_address_input', 'value_input', 'value_usd_input']]
edges_df
| transaction_hash | spending_transaction_hash | time_input | hash_address_input | value_input | value_usd_input | |
|---|---|---|---|---|---|---|
| 0 | 19164e615c48161141ff5c750c08c6a21202446b16720a... | b3b8d806902bbbb45371a16344c004c77a3d09f280f92d... | 2024-01-10 00:35:31 | bc1pu9uysdxzq99fzat8myautah7qnahec64dm3smpc95l... | 47076 | 21.6423 |
| 1 | 93f96b0e895bce583c0762d714e5d6cfb0eb2df5f4c8e7... | 7525607e5547dafaf647d8a0f4073ba873c0b3e614cf77... | 2024-01-10 00:35:31 | 14dJRoKyj2i83uRbTUeKqhFMwvFZcpiXyn | 768666170 | 353378.9062 |
| 2 | b3b8d806902bbbb45371a16344c004c77a3d09f280f92d... | 2a3b7681cc8e2828f6d167f94784c16a01d063e32aae64... | 2024-01-10 00:35:31 | bc1qwfn4fq5l86snpddxa7wy9jh6dc86vd6jdquw2g | 546 | 0.2510 |
| 3 | 6e2d986e8a97c21b51e5f5c8d03a6e35c5d575aff5e8cc... | f35ad09853a3707e9146fc30eb049142f8b66fd677bf97... | 2024-01-10 00:35:31 | bc1pk8fr0lkcfp7csz04hqn4alqy332h0nksmqmjjg9hye... | 42786 | 19.6700 |
| 4 | f35ad09853a3707e9146fc30eb049142f8b66fd677bf97... | 1a7622dab7b996900b4ba60dd598aa29d75dcfd244924b... | 2024-01-10 00:35:31 | 1GrwDkr33gT6LuumniYjKEGjTLhsL5kmqC | 546 | 0.2510 |
| ... | ... | ... | ... | ... | ... | ... |
| 105527 | 342750fe8b5fd681d07103f377528b1528415c467b6a3a... | 26d078c6cc1cb51e207663c4f6fa0ec45c3f3a1a7228a8... | 2024-01-10 23:51:21 | bc1pxx9qcafmhg0er6w92ffcstfwksymjy7tmvwwzqh54m... | 35277 | 16.2179 |
| 105528 | 806a43c458e6c6fb407452fcce180091ad1daa80237760... | 6f6fb148dad5038fdd8a03dd35b45fd13aa1ca5f93a378... | 2024-01-10 23:51:21 | bc1plz9ytx2a24va8zu59shg8unv6n86anvqd9kvcp6vcf... | 11665 | 5.3628 |
| 105529 | 35b2fd56400a31ef30263e39083c5dc387addb20ff42c7... | 83d2ff592f8f7064185cdd7168c63e6fb89bb3d9ab97bd... | 2024-01-10 23:34:19 | bc1qzcgex3h3sqmzj2qjztk5k9lhd8m3ts4r27lu02 | 286065 | 131.5127 |
| 105530 | 68da72d8daaef9541adfd4a212cd72ecaee6a541c90f39... | 8ea76e6b6ac29c91c9947f737a92d84c34e6e80dd3f091... | 2024-01-10 23:03:46 | bc1q7084tx0ztvxq3mfnx8nwpa4hpwuxf8zxrj3tcvyn6z... | 316553914 | 145529.3281 |
| 105531 | b9f0148dd7f89ce61d3012b1e3debbd743daed52af19c5... | f8fea9b38b16f76e6ac5a5ecb50a6cd4911689d4a4b0b4... | 2024-01-10 23:51:21 | bc1q6paugjafz08l60guyu8r6ut743va7aju0qj6nh | 546 | 0.2510 |
105532 rows × 6 columns
edges = edges_df.rename(columns={
'transaction_hash': 'from_transaction',
'spending_transaction_hash': 'to_transaction',
'time_input': 'time',
'hash_address_input': 'hash_address',
'value_input': 'value',
'value_usd_input': 'value_usd'
})
edges
| from_transaction | to_transaction | time | hash_address | value | value_usd | |
|---|---|---|---|---|---|---|
| 0 | 19164e615c48161141ff5c750c08c6a21202446b16720a... | b3b8d806902bbbb45371a16344c004c77a3d09f280f92d... | 2024-01-10 00:35:31 | bc1pu9uysdxzq99fzat8myautah7qnahec64dm3smpc95l... | 47076 | 21.6423 |
| 1 | 93f96b0e895bce583c0762d714e5d6cfb0eb2df5f4c8e7... | 7525607e5547dafaf647d8a0f4073ba873c0b3e614cf77... | 2024-01-10 00:35:31 | 14dJRoKyj2i83uRbTUeKqhFMwvFZcpiXyn | 768666170 | 353378.9062 |
| 2 | b3b8d806902bbbb45371a16344c004c77a3d09f280f92d... | 2a3b7681cc8e2828f6d167f94784c16a01d063e32aae64... | 2024-01-10 00:35:31 | bc1qwfn4fq5l86snpddxa7wy9jh6dc86vd6jdquw2g | 546 | 0.2510 |
| 3 | 6e2d986e8a97c21b51e5f5c8d03a6e35c5d575aff5e8cc... | f35ad09853a3707e9146fc30eb049142f8b66fd677bf97... | 2024-01-10 00:35:31 | bc1pk8fr0lkcfp7csz04hqn4alqy332h0nksmqmjjg9hye... | 42786 | 19.6700 |
| 4 | f35ad09853a3707e9146fc30eb049142f8b66fd677bf97... | 1a7622dab7b996900b4ba60dd598aa29d75dcfd244924b... | 2024-01-10 00:35:31 | 1GrwDkr33gT6LuumniYjKEGjTLhsL5kmqC | 546 | 0.2510 |
| ... | ... | ... | ... | ... | ... | ... |
| 105527 | 342750fe8b5fd681d07103f377528b1528415c467b6a3a... | 26d078c6cc1cb51e207663c4f6fa0ec45c3f3a1a7228a8... | 2024-01-10 23:51:21 | bc1pxx9qcafmhg0er6w92ffcstfwksymjy7tmvwwzqh54m... | 35277 | 16.2179 |
| 105528 | 806a43c458e6c6fb407452fcce180091ad1daa80237760... | 6f6fb148dad5038fdd8a03dd35b45fd13aa1ca5f93a378... | 2024-01-10 23:51:21 | bc1plz9ytx2a24va8zu59shg8unv6n86anvqd9kvcp6vcf... | 11665 | 5.3628 |
| 105529 | 35b2fd56400a31ef30263e39083c5dc387addb20ff42c7... | 83d2ff592f8f7064185cdd7168c63e6fb89bb3d9ab97bd... | 2024-01-10 23:34:19 | bc1qzcgex3h3sqmzj2qjztk5k9lhd8m3ts4r27lu02 | 286065 | 131.5127 |
| 105530 | 68da72d8daaef9541adfd4a212cd72ecaee6a541c90f39... | 8ea76e6b6ac29c91c9947f737a92d84c34e6e80dd3f091... | 2024-01-10 23:03:46 | bc1q7084tx0ztvxq3mfnx8nwpa4hpwuxf8zxrj3tcvyn6z... | 316553914 | 145529.3281 |
| 105531 | b9f0148dd7f89ce61d3012b1e3debbd743daed52af19c5... | f8fea9b38b16f76e6ac5a5ecb50a6cd4911689d4a4b0b4... | 2024-01-10 23:51:21 | bc1q6paugjafz08l60guyu8r6ut743va7aju0qj6nh | 546 | 0.2510 |
105532 rows × 6 columns
hash_address with the entity name¶For reminder we imported previously a dataset which contain the entity name associated to hash_adress from 2016.
df_entity dataset¶# merged_df = pd.merge(edges, df_entity[['hash_address', 'entity_name', 'entity_type', 'address_type']], on='hash_address', how='left')
# merged_df["entity_name"].value_counts()
entity_name ePay.info 1 Name: count, dtype: int64
Unfortunately, we were unable to match all the hash addresses with the hash addresses for which we know the name of the entity. Indeed, the only entity we have managed to match is ebay So, for lack of data, we abandon the idea of being able to match the addresses of our graph with the names of entities.
hash_addressvaluevalue_usd# Add edge list from pandas dataframe
G = nx.from_pandas_edgelist(edges, 'from_transaction', 'to_transaction', ['hash_address', 'value', 'value_usd'],
create_using=nx.MultiDiGraph)
nx.is_directed_acyclic_graph(G)
True
The output is true means that our network is acyclic. That's a good point because a transaction bitcoin network is directed acyclic graph because we cannot, it is not possible to make a transaction to oneself (i.e identical source address and target adress for a given transaction)
G["19164e615c48161141ff5c750c08c6a21202446b16720a4f27541fa32cd2c695"]["b3b8d806902bbbb45371a16344c004c77a3d09f280f92de83f341720aa537b7f"]
AtlasView({0: {'hash_address': 'bc1pu9uysdxzq99fzat8myautah7qnahec64dm3smpc95l09eypr5nss4du7v4', 'value': 47076, 'value_usd': 21.6423}})
Node = a transaction with the following attributes:
hash, timestamp, intint, int,int, int, float, int,float, int, float, boolnodes_df = pd.read_csv("nodes.csv")
nodes_df = nodes_df[['transaction_hash', 'time', 'size', 'weight', 'input_count',
'output_count', 'input_total', 'input_total_usd', 'output_total',
'output_total_usd', 'fee', 'fee_usd', 'is_coinbase']]
nodes_df
| transaction_hash | time | size | weight | input_count | output_count | input_total | input_total_usd | output_total | output_total_usd | fee | fee_usd | is_coinbase | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | ef823e020330de717cc72f834cc3dc7f32baf4ea7e3e5b... | 2024-01-10 00:35:31 | 299 | 1088 | 1 | 3 | 0 | 0.000000e+00 | 737190208 | 3.389085e+05 | 0 | 0.0000 | 1 |
| 1 | 3f325a9160b6d49b1a60bdeb56689838a7fd9c38f2af56... | 2024-01-10 00:35:31 | 225 | 573 | 1 | 2 | 465974173 | 2.142223e+05 | 465864157 | 2.141717e+05 | 110016 | 50.5777 | 0 |
| 2 | 5c56ae6de10afbcc1786c3093e0fb92d4260d1cb9090ec... | 2024-01-10 00:35:31 | 225 | 900 | 1 | 2 | 8729714 | 4.013311e+03 | 8583260 | 3.945982e+03 | 146454 | 67.3293 | 0 |
| 3 | ef3cde8f1ce0c98388b973f855aaee003e9265dfa09b38... | 2024-01-10 00:35:31 | 226 | 574 | 1 | 2 | 210000000 | 9.654330e+04 | 209928000 | 9.651020e+04 | 72000 | 33.1006 | 0 |
| 4 | ef91085cbe36fb48b8391ada7af60e18b6e752eb4efc3a... | 2024-01-10 00:35:31 | 419 | 1025 | 2 | 2 | 17000493887 | 7.815637e+06 | 17000369903 | 7.815580e+06 | 123984 | 56.9992 | 0 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 383769 | 6fdf800be6274f476ad89322818a5ba54ba9f9cd530792... | 2024-01-10 23:51:21 | 379 | 625 | 1 | 1 | 6969 | 3.203900e+00 | 294 | 1.352000e-01 | 6675 | 3.0687 | 0 |
| 383770 | cf81f0ca6c1ec5639ff010dfe44a4fbe41708e0eb6ab80... | 2024-01-10 23:51:21 | 379 | 625 | 1 | 1 | 6969 | 3.203900e+00 | 294 | 1.352000e-01 | 6675 | 3.0687 | 0 |
| 383771 | fb8907c6983b9569c2eb8532e58b2db83752799390e69e... | 2024-01-10 23:51:21 | 379 | 625 | 1 | 1 | 6969 | 3.203900e+00 | 294 | 1.352000e-01 | 6675 | 3.0687 | 0 |
| 383772 | 37e8e17e238972b852f143c6c3473128d821da14159f3b... | 2024-01-10 23:51:21 | 379 | 625 | 1 | 1 | 6969 | 3.203900e+00 | 294 | 1.352000e-01 | 6675 | 3.0687 | 0 |
| 383773 | b7311d24739cdf0a680f5c70ee09ea15e2134e3fc1a7ff... | 2024-01-10 23:51:21 | 379 | 625 | 1 | 1 | 6969 | 3.203900e+00 | 294 | 1.352000e-01 | 6675 | 3.0687 | 0 |
383774 rows × 13 columns
# Add node attributes
nodes = nodes_df.set_index('transaction_hash').to_dict(orient='index')
for transaction_hash, data in nodes.items():
G.add_node(transaction_hash, time=data['time'], size=data['size'], weight=data['weight'],
input_count=data['input_count'], output_count=data['output_count'],
input_total=data['input_total'], input_total_usd=data['input_total_usd'],
output_total=data['output_total'], output_total_usd=data['output_total_usd'],
fee=data['fee'], fee_usd=data['fee_usd'], is_coinbase=data['is_coinbase'])
G.nodes["19164e615c48161141ff5c750c08c6a21202446b16720a4f27541fa32cd2c695"]
{'time': '2024-01-10 00:35:31',
'size': 234,
'weight': 609,
'input_count': 1,
'output_count': 2,
'input_total': 3213993,
'input_total_usd': 1477.569,
'output_total': 3163503,
'output_total_usd': 1454.3572,
'fee': 50490,
'fee_usd': 23.2118,
'is_coinbase': 0}
.csv the edgdes dataframe and transaction nodes dataframe¶edges_df.to_csv("edges.csv")
nodes_df.to_csv("nodes.csv")
compute_graph_statistics(G)
MultiDiGraph with 383774 nodes and 105532 edges Number of nodes 383774 Number of edges 105532 Average degree 0.549969513307311 Density 7.165297106718179e-07
We can note that the number of nodes is more than 3 times greater than the number of edges in the graph. This explains why the density of the graph is very low. This means that a transaction block is generally linked to only one other transaction block. Furthermore, given that the number of degrees (i.e. the number of neighbors) is very small (0.5), we can deduce most of transaction blocks receive only one input transaction and don't have any output transation.
draw_subset_graph(G, nodes, nb_nodes=5000, title="Plot of a subset of bitcoin transaction network")
The shape of the graph below seems to confirm the above observations. Namely, the majority of transaction blocks are linked only to another transaction block. The vast majority of transaction blocks appear to have, on average, a single input and output. Visually, it's also clear that a large number of transaction blocks have only one input and no output.
hash_node_transaction = "aa51f19748493441d743a0c408201507b99948795f7e2387a103d9c3f415c56a"
plot_graph_for_specific_node(G, hash_node_transaction)
We focus on one transaction in particular. The choice of this transaction is completely arbitrary and the code can be parameterized. Knowing the hash of the transaction in question, it is possible, thanks to the previous function, to trace all the interactions (input and output) of a block of transactions.
In the graph above, we're interested in the interactions of the transaction whose hash is aa51f19748493441d743a0c408201507b99948795f7e2387a103d9c3f415c56a. We can see that this transaction has many more inputs than outputs. This transaction has many more inputs than outputs. Also, this transaction block is connected to 78 other transaction blocks (77 transactions inputs and 1 transaction output).
The SEC's approval for the creation of Bitcoin Spot ETFs on US exchanges has important implications for the crypto sphere. It will offer a new way of investing in Bitcoin (BTC) for US investors, which could lead to increased demand and a higher Bitcoin price. Bitcoin Spot ETFs could also attract new investors, notably institutional investors.
So the aim of our study is to investigate the impact of this announcement on Bitcoin transactions.
To this end, we will analyze transactions that took place before and after the SEC's official announcement.
The SEC announcement took place on 10-01-2024 at 10:30pm French time, i.e. 4:30pm Wahshington time. Given that the timestamp used by bitcoin is UTC. The SEC's decision took place at 09:30pm.
Consequently, we will analyze the transactions that took place 2 hours before and after the SEC announcement. This will also enable us to work on lighter graphs (containing fewer nodes and vertices).
plot_statistics_bitcoin_transaction(nodes_df)
These graphs show that the SEC announcement on bitcoin ETFs does not seem to have had any impact on the number of bitcoin transactions. We might have expected the number of transactions to explode after the SEC's announcement, but that's not what we're seeing. In fact, the number of transactions decreased after the SEC announcement. Overall, we're seeing around 3,000-4,000 transactions every 15 minutes before the SEC decision. The same phenomenon is also visible after the SEC decision. We note, however, that the maximum number of transactions before the SEC decision is double the maximum number of transactions after the SEC decision.
Regardint to the amounts involved in mining transaction blocks, we can observe that the fees for mining bitcoin transactions are overwhelmingly in the 2-digit range (i.e. between 0 dollars and 50 dollars). However, there are some transactions for which the mining fee is in 3 digits. For example, the maximum amount was 1750 dollars before the SEC announcement and around 1400 dollars after the SEC announcement.
Finally, the weights associated with each bitcoin transaction block are mostly low. However, some transaction blocks stand out for their very high weights. We can imagine that these transactions are important nodes in our graph (large number of degrees). This suggests that certain nodes play the role of pivot nodes in our graph, due to the large number of neighbors. These nodes must have a large number of inputs and outputs.
start_time = '2024-01-10 19:30:00'
sec_announcement_time = '2024-01-10 21:30:00'
end_time = '2024-01-10 23:30:00'
G_before, G_after = build_networks_SEC_announcement(G, start_time, sec_announcement_time, end_time)
print("#### Statistics before the SEC announcement ####\n")
compute_graph_statistics(G_before)
print("\n\n#### Statistics after the SEC announcement ####\n")
compute_graph_statistics(G_after)
#### Statistics before the SEC announcement #### DiGraph with 45415 nodes and 6035 edges Number of nodes 45415 Number of edges 6035 Average degree 0.26577122096223715 Density 2.9260935059919534e-06 #### Statistics after the SEC announcement #### DiGraph with 30245 nodes and 5201 edges Number of nodes 30245 Number of edges 5201 Average degree 0.3439246156389486 Density 5.685832159088557e-06
We can see that the statistics of the transaction graphs before and after the SEC decision on the use of bitcoin ETFs are very similar overall. However, the number of nodes in the graph after the SEC decision is lower than the graph before the SEC decision (-15,000 fewer nodes after the SEC decision). Consequently, the average number of neighbors is higher after the SEC decision.
To do so, we identifying the nodes with the highest degrees
df_top_nodes_degrees_before, nodes_highest_degree_before = nodes_with_highest_degree(G_before, n_node=3000)
df_top_nodes_degrees_after, nodes_highest_degree_after = nodes_with_highest_degree(G_after, n_node=3000)
3000 important nodes (highest degrees)¶plot_node_with_highest_degrees(G_before, nodes_highest_degree_before, type_sec="(Before SEC announcement)")
plot_node_with_highest_degrees(G_after, nodes_highest_degree_after, type_sec="(After SEC announcement)")
We note that the graph before the SEC decision has more nodes with a large number of degrees (yellow and green nodes) than the graph after the SEC decision. On the other hand, the graph after the SEC decision seems to have 2 very important nodes with more degrees (35 neighbors) than the most important nodes (25 neighbors) in the graph before the SEC decision.
Overall, the graphs confirm the statistics we calculated earlier, as they show very similar patterns. Consequently, the SEC's decision seems to have little impact on the dynamics of bitcoin transactions.
10 most important nodes (highest degrees)¶For obvious reasons of computational resource, we decided to concentrate our analysis on the 10 transactions bicoin nodes with the highest degree.
# Select the first 10 nodes with the highest degrees after and before the SEC decision
df_G_top10_before, nodes10_highest_degree_before = nodes_with_highest_degree(G_before, n_node=10)
df_G_top10_after, nodes10_highest_degree_after = nodes_with_highest_degree(G_after, n_node=10)
top10_nodes_before = df_G_top10_before["hash_transaction"].tolist()
top10_nodes_after = df_G_top10_after["hash_transaction"].tolist()
# Build the subgraph of the 1000 nodes with highest degrees after and before the SEC decision
G_top10_before = build_subgraph_top_nodes(G_before, top10_nodes_before)
G_top10_after = build_subgraph_top_nodes(G_after, top10_nodes_after)
print("#### Statistics before the SEC announcement ####\n")
compute_graph_statistics(G_top10_before)
print("\n\n#### Statistics after the SEC announcement ####\n")
compute_graph_statistics(G_top10_after)
#### Statistics before the SEC announcement #### DiGraph with 229 nodes and 219 edges Number of nodes 229 Number of edges 219 Average degree 1.9126637554585153 Density 0.004194438060216042 #### Statistics after the SEC announcement #### DiGraph with 222 nodes and 212 edges Number of nodes 222 Number of edges 212 Average degree 1.90990990990991 Density 0.004321063144592557
By focusing our analysis on the interactions of the 10 largest transactions that took place before and after the SEC ruling. We note that these transactions show similar patterns. Indeed, the descriptive statistics of the two graphs (before and after the SEC decision) are very close. For example, the average number of degrees is identical. Most nodes have 2 neighbors. This suggests that most nodes have a single input and a single output.
plot_distribution_degrees(df_G_top10_before["degrees"], df_G_top10_after["degrees"]);
This graph shows that the graph after the SEC decision has more nodes with higher degrees (over 30 nodes) than the graph before the SEC decision.
This highlights the fact that the graph after the SEC decision has 2-3 nodes playing a pivotal (very important) role.
plot_graph_before_after(G_top10_before, G_top10_after, top10_nodes_before, top10_nodes_after)
Betweenness centrality can be useful in identifying critical nodes in a network, such as key connectors or bottlenecks, and can provide insights into how information or influence spreads throughout the network.
compute_betweenness_centrality(G_top10_before, G_top10_after)
We can see that de betweenness centrality of the both graph (before and after the SEC announcement) are very low. This means that no node in the graph acts as a bridge along the shortest path between pairs of other nodes in both networks. Hence, no nodes semmes to be crucial for maintaining communication and facilitating the flow of transaction between different parts of the network.
The degree centrality measures the number of direct connections a node has. Nodes with high degree centrality are highly connected to other nodes in the network and are often considered important for information dissemination.
$$ \frac{\text{Number of neighbours I have}}{\text{Number of neighbours I could possibly have}} $$Degree Centrality of a node can range from 0 (if the node has no connections) to 1 (if the node is connected to every other node in the network)
compute_degree_distribution(G_top10_before, G_top10_after)
We can note that the distributions of the number of degrees and the distribution of degree centrality are very similar for both network (before and after the SEC announcement). In addition, we can see that the higher the number of degrees, the higher the degree of centrality.
circo_plot_graph(G_top10_before, G_top10_after, node_order='input_total_usd', node_color='input_total_usd')
arc_plot_graph(G_top10_before, G_top10_after, node_order='input_total_usd', node_color='input_total_usd')
transaction_fees_amount(G_top10_before, G_top10_after)
plot_transaction_graph_bis(G_top10_before, G_top10_after)
plot_transaction_bitcoin_network(G, G_top10_before, G_top10_after)
Betweenness is a measure of centrality based on shortest paths. The betweenness centrality for each node is the number of these shortest paths that pass through the node.
Before the SEC announcement
plot_betweenness_centrality_network(G_top10_before, type_sec="Before the SEC announcement")
After the SEC announcement
plot_betweenness_centrality_network(G_top10_after, type_sec="After the SEC announcement")
PageRank is an algorithm used by Google Search to rank web pages in their search engine results. It is a way of measuring the importance of website pages. For example, PageRank works by counting the number and quality of links to a page to determine a rough estimate of how important the website is.
def apply_pagerank_and_plot_side_by_side(G_before, G_after):
# Apply PageRank algorithm for both networks
pagerank_scores_before = nx.pagerank(G_before)
pagerank_scores_after = nx.pagerank(G_after)
# Plotting the graphs side by side
fig, axs = plt.subplots(1, 2, figsize=(15, 6))
# Plot for Before SEC Announcement network
pos_before = nx.circular_layout(G_before)
node_sizes_before = [v * 1e4 for v in pagerank_scores_before.values()] # Adjust multiplier as needed
node_color_before = list(pagerank_scores_before.values())
node_colorbar_before = axs[0].scatter([], [], c=[], cmap='coolwarm', edgecolor='none')
nx.draw(G_before, pos_before, ax=axs[0], with_labels=False, node_color=node_color_before, cmap='coolwarm', edge_color='gray', linewidths=0.5, font_size=8, node_size=node_sizes_before)
axs[0].set_title("Before SEC Announcement")
# Plot for After SEC Announcement network
pos_after = nx.circular_layout(G_after)
node_sizes_after = [v * 1e4 for v in pagerank_scores_after.values()] # Adjust multiplier as needed
node_color_after = list(pagerank_scores_after.values())
node_colorbar_after = axs[1].scatter([], [], c=[], cmap='coolwarm', edgecolor='none')
nx.draw(G_after, pos_after, ax=axs[1], with_labels=False, node_color=node_color_after, cmap='coolwarm', edge_color='gray', linewidths=0.5, font_size=8, node_size=node_sizes_after)
axs[1].set_title("After SEC Announcement")
# Add color bar
cbar_before = fig.colorbar(node_colorbar_before, ax=axs[0], orientation='horizontal', label='PageRank Score')
cbar_after = fig.colorbar(node_colorbar_after, ax=axs[1], orientation='horizontal', label='PageRank Score')
plt.suptitle("PageRank")
plt.tight_layout()
plt.show()
# Call the function with the network graphs
apply_pagerank_and_plot_side_by_side(G_top10_before, G_top10_after)
In comparing the state of the Bitcoin transaction network before and after the SEC's announcement of a Bitcoin ETF, several key insights emerge.
Firstly, the reduction in the number of nodes post-announcement suggests that while there may have been an anticipation or speculation leading to increased activity before the announcement, the actual event led to a consolidation rather than an expansion in transaction activity. Despite the lower number of nodes, the increase in average degree and density indicates that the transactions that did occur were more interconnected, possibly reflecting a more cautious or strategic approach by those transacting in Bitcoin immediately after the announcement.
Interestingly, the descriptive statistics, such as average degree and density, are relatively close before and after the announcement, suggesting that the overall structure and connectivity of the network remained stable. This stability points to the underlying robustness of the Bitcoin network and indicates that major news events may not have as dramatic an immediate impact as might be expected.
The circos and arc plots depicting the top transactions by total input in USD before and after the announcement demonstrate the concentration of transaction volume among a few significant nodes. These focal nodes could represent large institutional moves or aggregations of smaller transactions by intermediaries.
Finally, betweenness centrality metrics suggest that the network does not rely on a few nodes to facilitate transaction flows. A low betweenness centrality across the network indicates a distributed interconnectivity that does not bottleneck at particular nodes. This further underpins the decentralized nature of the Bitcoin transaction network.
Overall, the SEC's announcement appears to have had a more nuanced effect on the Bitcoin network than might be presumed. Instead of a surge in transaction activity, there seems to be a strategic shift in the way transactions are conducted, with the most significant transactions playing a pivotal role. These insights highlight the complex and multifaceted response of decentralized networks to regulatory and external stimuli.